INTRODUCTION

The CYRUS system is a natural language data-base query system
containing biographical information about Cyrus Vance, Secretary of
State of the United States. It also contains additional information
associated with Vance that may be needed in order to answer questions
about him, such as current events and limited biographical information
about people he has been in close contact with. Questions input in
English are answered by accessing conceptual structures containing
this information.

Humans are good processors of conceptual information. Upon
hearing a natural language utterance, they extract both the implicit
and explicit content of the utterance, combine this with contextual
information to further refine the content, integrate the concepts
involved with what they already know, and make the appropriate answer.
Later, they can retrieve the content of that utterance from memory and
use it to understand something else they hear, making the appropriate
assumptions and inferences, and supplying an appropriate answer.

*This work was supported in part by the Advanced Research Projects Agency
of the Department of Defense and monitored under the Office of Naval
Research under contract N00014-75-C-1111.


CYRUS is an attempt to model the way human memory might be organized
to do the task just described. It is designed to extract both meaning
and intent from natural language input, and use that input either to
update its memory or to answer a question. It is the organization of
memory that is of primary importance in describing the task of adding
new information or retrieving information that is already there.


This paper addresses the problems of knowledge organization and
retrieval of information in the data-base. The conceptual structures
used to represent, organize, and retrieve information are described,
as well as the procedures used to answer natural language questions.


Specific event information about the persons represented in the
data-base must be stored; so must world events, general world
knowledge, and rules about inference and reference resolution. The
first half of the paper will describe and explain some of the
structures used for organizing this knowledge; the second half will
show how CYRUS makes use of these structures.


A sample question-answering dialogue with the system follows. It
illustrates some of the information stored in the data-base, some of
the types of questions the system can answer, and some of the
reference problems it can solve.


Q1: Who is Cyrus Vance?

A1: SECRETARY OF STATE OF THE UNITED STATES.

Q2: How did Vance become Secretary of State?

A2: HE WAS APPOINTED BY PRESIDENT CARTER.

Q3: Who is Grace Vance?

A3: CYRUS VANCE'S WIFE.

Q4: Do they have any kids?

A4: YES, FIVE.

Q5: How old is Vance?

A5: 61.

Q6: Where was he born?

A6: CLARKSBURG, WEST VIRGINIA.

Q7: Who are his parents?

A7: AMY ROBERTS VANCE AND JOHN CARL VANCE.

Q8: Where did he go to college?

A8: YALE UNIVERSITY.

Q9: Where was Vance last week?

A9: IN THE UNITED STATES.

Q10: Where is he now?

A10: IN ISRAEL.

Q11: Why did he go there?

A11: TO DISCUSS ARAB-ISRAELI RELATIONS WITH THE EGYPTIAN
     AND ISRAELI FOREIGN MINISTERS.

Q12: How long will he be there?

A12: SIX DAYS.

Q13: Where will he go afterwards?

A13: TO EGYPT.


MOTIVATION FOR CYRUS


In any data-base, a major problem is how to represent the
information to be stored and how to systematically organize and
inter-relate that information so that the data-base can be efficiently
updated and queried. In a very large data-base, some information
might not be explicitly in the data-base, but could be there
implicitly as an inference from something explicit. In order to
retrieve information that is only implicitly in the data-base, it is
necessary to efficiently organize both the information in memory and
the procedures which access and perform inferences on that
information.

A second problem of data-base systems is specifying a query
language. Natural language is the most useful language for querying a
data-base in that it does not require the questioner to have any
special training. It makes the data-base accessible to everyone.
Communicating to a data-base in natural language, whether for updating
or for retrieval of information, requires that the system understand
the meaning, intent, and other implications of the natural language
utterance. Inferences that must be made from a natural language
utterance must be organized systematically to avoid an exponential
explosion upon implementation.

In the CYRUS system, we have used instantiations of conceptual
frames and knowledge structures (such as scripts) to represent and
organize specific events in memory. These knowledge structures are
also used to organize inferences necessary for natural language
dialogue with the data-base, for both the updating and the retrieval
processes.

A data-base meant to be updated and queried in natural language
should be organized taking into account the peculiarities of natural
language. Most data-bases have not been designed in this way, and
have dealt with natural language inquiry only through natural language
front ends specific to each data-base. Such a front-end interface is
typically added on after the data-base has been designed, thus
ignoring the fact that efficient natural language inquiry requires a
data-base specifically designed for that purpose. CYRUS is a
data-base designed for natural language updating and querying. Both
the nature of the information in the data-base and the wide range of
possibilities in natural language dialogue have been taken into
account in designing CYRUS.

In understanding natural language, it is necessary to use
contextual information and knowledge about the intent of the speaker.
Because the implications and intent of an utterance are all included
in its meaning, its internal representation must include implications
and intent. CYRUS does this by using conceptual representations,
including Conceptual Dependency (CD), scripts, and role themes (Schank
and Abelson, 1977). These structures, along with a new conceptual
structure called an era, are used to represent and organize
information in memory, and also to guide the processing of new
information, either to update memory or to answer questions.


Lexical representations, which use English words to represent and
retrieve information, do not offer the flexibility needed to deal with
conceptual knowledge. For example, suppose a lexical data-base had
the information "John has a red car" and was asked the question "Who
can drive to work tomorrow?" A data-base representation based on
words could answer that question only by linking the words "car" and
"driving" in a way that is not generalizable to other domains. A
conceptual data-base, on the other hand, would use the meaning of the
question and the inferences associated with that meaning to answer the
question. It would not be concerned that the words do not match,
because the representations in the data-base and the internal
representation of the question would not be based on words, but rather
on meaning. A conceptual data-base would construct a canonical
meaning representation of the question to which inferences could be
efficiently associated. Different utterances with the same meaning
would be associated with the same representation and inferences. A
conceptual data-base must be able to make inferences from the meaning
of a question: it must efficiently figure out the intent of the
question and the correct place in the data-base to look. It should
also be generalizable enough to add arbitrary new domains easily.


A number of relational data-bases and question-answering systems
that use symbolic logic, deductive reasoning, and statistical methods
have been written (for examples, see Petrick (1975), Waltz (1976), and
McSkimin and Minker (1977)). While all of these systems have produced
working models, they have not been developed as generalized systems
for dealing with conceptual information, and have been deficient in
one or more of the problem areas mentioned above.

In addition to the interest in improving data-base design, CYRUS
was also motivated by ongoing work at Yale. SAM (Cullingford, 1978)
and FRUMP (DeJong, 1977) are two programs that understand newspaper
stories by using scripts. SAM reads in detail, using knowledge of a
script to fill in missing details of the story. FRUMP skims news
stories from the UPI wire and produces summaries. In working with
these systems, it became apparent that we would benefit from a general
memory model which would know outside information that SAM did not
know, but which could be consulted by SAM. For example, in stories
about VIP's, SAM needed to know certain information about these VIP's
(their names, functions, family relations, where they were from,
etc.) in order to understand the stories. However, there is no
reason why SAM, which is a script specialist, should have to store
that information. Such information should be stored in a separate
memory module which SAM, or any other program needing that information
(such as FRUMP or PAM (Wilensky, 1978) or the conceptual analyzer
(Riesbeck, 1975)) could query.

CYRUS was designed to integrate information produced by FRUMP
(which processes input from the UPI wire) into a data-base that could
be continually updated. Vance was chosen as the subject of the memory
model because he is in the news so often and will provide a large
number of news updates.
STRUCTURES IN CYRUS FOR ORGANIZING EVENTS


Representations of events in CYRUS take the form of scripts and
Conceptual Dependency representations. Scripts and CD are used to
represent events in memory, but they are not representationally
adequate for organizing all the biographical information we know about
someone. Events in memory must be organized in a way that makes both
the retrieval of an event and the addition of a new event fast and
efficient. Information in memory must be organized so that memory can
be searched as efficiently as possible. In answering a question about
Vance's law career, we don't care about his experiences in high
school; we want to be able to look directly in the right place in the
data-base to get the answer. An era is a type of memory structure
designed for organizing biographical events in memory.

Eras represent time spans in a person's life that are
characterized by one outstanding occupational, familial, or social
role. For instance, a person's four years in high school constitute
an era. His primary role is that of high-school student; his primary
activity during that time is attending high school every day (or doing
the high-school script every day). Periods of going to college or
professional school, being in the military service, being a politician
or businessman, or being a parent are other examples of eras.

A role theme (Schank and Abelson, 1977) is the information
suggested by a person's occupational, social, family, or situational
role. Examples of role themes are: student, businessperson,
professional, politician, parent, husband, and wife. For example, if
we know that someone is a student, then we assume that person goes to
classes during the day, does homework, is probably not very wealthy,
probably does not work full time but may work part time, that he or
she has a course of study, may be training for a particular type of
job, and so forth. The student role theme contains all of that
information.

A role theme may have sub-role-themes associated with it. For
instance, one sub-role-theme of the student role theme is high-school
student, another is college student, and another is graduate student.
If we know somebody is a graduate student, we assume student role
theme information about him, and in addition we assume role-theme
information specific to graduate students: they are usually supported
by fellowships, they keep odd hours, they may be expected to do some
teaching, they do independent research, they have a strong interest in
their field of study, and they are preparing for a job in that field.
We assume role theme information when we hear that a person is living
in that role and we have no other contradictory knowledge about that
person's involvement in the role. If a person deviates from the
stereotypic role theme, then we must use our specific knowledge of his
unique situation for understanding.
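
The way a sub-role-theme inherits default assumptions from its parent
role theme, and the way specific knowledge about a person overrides
those defaults, can be sketched as follows. This is only an
illustrative sketch in Python, not the CYRUS implementation; the class
name RoleTheme, the method name assumptions, and the attribute names
are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RoleTheme:
    """A stereotype: default assumptions about anyone living in this role."""
    name: str
    defaults: dict = field(default_factory=dict)
    parent: Optional["RoleTheme"] = None  # the more general role theme, if any

    def assumptions(self, known_facts=None):
        """Merge defaults from the most general role theme down to this one,
        then let specific knowledge about the person override them."""
        merged = self.parent.assumptions() if self.parent else {}
        merged.update(self.defaults)
        merged.update(known_facts or {})
        return merged


# Hypothetical attribute names, chosen only for illustration.
student = RoleTheme("student", {"attends_classes": True, "works_full_time": False})
grad_student = RoleTheme(
    "graduate-student",
    {"support": "fellowship", "does_research": True},
    parent=student,
)

# A graduate student who deviates from the stereotype by working full time.
facts = grad_student.assumptions({"works_full_time": True})
```

Here the graduate student still inherits "attends_classes" from the
student role theme, while the contradictory fact about full-time work
replaces the stereotypic default.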


Eras contain and organize all the biographical information
related to a specific role theme in a person's life, and are named for
that particular role theme. Since many role themes occur
simultaneously in a person's life, many eras can exist simultaneously.
For example, a married businessperson occupies at least two separate
role themes: during the day at work, he or she is in the
businessperson role theme, and at home with the family, he or she is
in the husband or wife (or possibly also the parent) role theme.
Events associated with work would be stored in an occupational era for
that person, while those associated with being in the spouse role
theme would be stored in a family era. In CYRUS, eras are defined by
major occupational (including educational), family, and social role
themes. Thus, each person has parallel sequences of occupational,
family, and social eras, with each era characterized by a particular
occupational, family, or social role theme.


USING ROLE THEMES AND ERAS IN UNDERSTANDING

Role theme and era knowledge is useful in retrieving information
from memory. Suppose we want to answer the question "Has Vance ever
won an election?" Script, role theme, and era knowledge are all used
in answering this question. The election script, mentioned in the
question, is usually done by someone in a political role theme, and
would therefore be found in a political era. Processing in the
election script would tell us to look in Vance's political eras for
the answer. Events during those portions of his life can be searched
for the answer. Political role themes found in his political eras can
be searched to see if one required an election. In this case, the
political role themes that Vance has assumed have been advisor to the
president, representative at peace talks, and Secretary of State. We
can look at the enablement or entry conditions for each of those role
themes to see if any were elected positions. In this case, none were.
We have not found an instance of Vance winning a political election.


However, further processing in the election script tells us that
people can win elections in school or social organizations, and that
we should therefore look in eras when Vance was in school, or in
social eras, to find an instance of him winning an election. A
similar procedure for these two era types provides the answer.
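
The search order described above (political eras first, then school
and social eras, checking both stored events and the entry conditions
of each role theme) might be sketched like this. The sketch is an
assumption-laden illustration in Python, not the CYRUS code: the names
Era, ELECTION_SCRIPT_ERAS, find_election_win, and $ELECTION-WIN are
hypothetical, and the life sequence is toy data, not the contents of
the CYRUS data-base.

```python
from dataclasses import dataclass, field


@dataclass
class Era:
    kind: str               # "political", "school", "social", "family", ...
    role_theme: str
    entry_condition: str    # how the actor entered the role theme
    events: list = field(default_factory=list)  # script names of stored events


# Assumption: a script lists the era kinds worth searching, most likely first.
ELECTION_SCRIPT_ERAS = ["political", "school", "social"]


def find_election_win(eras):
    """Return the first era that shows an election win, or None."""
    for kind in ELECTION_SCRIPT_ERAS:
        for era in (e for e in eras if e.kind == kind):
            if era.entry_condition == "elected" or "$ELECTION-WIN" in era.events:
                return era
    return None


# Toy life sequence for illustration only.
life = [
    Era("school", "RT-COLLEGE-STUDENT", "admitted", ["$ELECTION-WIN"]),
    Era("political", "RT-SEC-OF-STATE", "appointed", ["$VIPVISIT"]),
    Era("family", "RT-HUSBAND", "married"),
]

hit = find_election_win(life)
```

Note that the family era is never examined, because the election
script does not name it as a place to look; this is the sense in which
script knowledge narrows the search.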

Knowledge about eras and role themes is also used in adding
information to the data-base and in interpreting information retrieved
from the data-base. For example, knowing that Vance is a political
dignitary tells us that he will do things to advance his political
career. Suppose we want to add to the data-base that Vance went to a
party with other cabinet members. Going to a party is normally a
social event that belongs in a social era, but in this case, further
investigation of the event shows that the other people invited were
political colleagues rather than social friends of Vance. One rule we
have about career role themes is that socializing with colleagues is
something career people do. We would therefore put the event in
Vance's political career era. Later, if asked why he went to the
party, we can use role theme information from the era in which the
event is stored to answer that he went to the party to maintain
political career relationships.
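
The filing decision in the party example can be sketched as a small
rule. This is a hypothetical Python illustration of the one rule
stated above (socializing with colleagues is a career activity); the
function name choose_era, the script-to-era table, and the token names
are assumptions, not part of CYRUS.

```python
def choose_era(event_script, companions, colleagues,
               career_era="political-career"):
    """Pick the era kind under which a new event should be stored.

    Assumed rule (paraphrasing the text): a normally social event is
    filed under the career era when everyone else involved is a colleague.
    """
    # Hypothetical default filing for each script.
    default_era = {"$PARTY": "social", "$VIPVISIT": career_era}
    era = default_era.get(event_script, "unclassified")
    if era == "social" and companions and set(companions) <= set(colleagues):
        era = career_era
    return era


# A party attended only by colleagues goes in the career era.
filed_under = choose_era("$PARTY", ["HUM20", "HUM21"],
                         ["HUM20", "HUM21", "HUM22"])
```

Because the event is stored in the career era, a later "why" question
can be answered from the role theme information attached to that era.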

Role themes and eras are important because they tell us how to
structure knowledge, and they also give us expectations that tell us
how to process knowledge both for question-answering and for
memory-updating. Representations, storage processes, and retrieval
mechanisms are closely related through scripts, role themes, and eras.
ORGANIZING INFORMATION ABOUT PEOPLE

Since CYRUS holds primarily biographical information about
people, an important knowledge structure in this data-base is the
person frame. The person frame organizes all the information we may
have about a person. For people we know very well, the person frame
is very densely filled; for people we don't know as well, it is more
sparse. Persons have names, parents, spouses, children, occupations,
appearances, attitudes, histories, etc., which are all represented in
the person frame. One of the most important properties of a person
(perhaps the most important in this system) is his history or life
sequence. Figure 1 (p. 15) gives an example of some of the properties
included in a typical person frame. These are the properties that
usually interest a typical friend. Therefore, information such as
office address and social security number is not included in the
person frame, while name, spouse's name, occupation, and hair color
are.

A person's life cycle can be represented by sequences of eras.
Sequences of eras are viewed as the personal time line for the person
in question. Upon hearing about a person, we make assumptions about
that person based upon what we know about possible era sequences. We
know that he had an early childhood beginning at birth, that he had a
family and certain family relationships, and, if he is in a western
culture, that he went to elementary school, high school, etc. Upon
hearing about a person who is a lawyer, we combine our knowledge about
era sequences and enablements for becoming a lawyer in order to
recognize that college and law school were also eras in his
occupational sequence of eras.


An era includes the role theme of the person during that time,
time pointers to other parallel era sequences, and the list of
biographical events that happened during that era and were related to
the associated role theme. The list of biographical events contains
CD representations for events and sequences of events as well as
instances of scripts. All of the biographical information concerning
an era can be found in its list of events. However, some information
is more important than other information, and for reasons of
efficiency of retrieval, that information is represented redundantly
in other lists attached to the era. Important information includes
people the actor is in contact with in this role theme and his
relationships to those people; related activities and hobbies the
person is involved with while in this role theme; and information
about how this person differs from the stereotypical role theme. All
of the information in these lists of people, activities, and so forth
can also be found by looking through the list of events. However,
retrieval becomes more efficient, since retrieval of important data is
easier than retrieval of more trivial data, certainly a desirable
feature in a general retrieval system. These additional lists contain
the important information that is brought to mind in thinking about an
era, and thus encode information that is most likely to be queried.
In addition, some events in the list of events are tagged as being
important events, i.e., those most likely to be remembered first in
thinking about the era and those most likely to be asked about.
The same redundancy occurs with respect to the person frame.
Most of the properties attached to the person frame, including current
occupation and family relationships, can be found by looking at the
person's life sequence. However, current information is likely to be
accessed more than past information, so it is represented a second
time in the person frame. This enables more relevant information to
be retrieved more easily.
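
The redundancy described above (a complete event list, plus short
lists and importance tags for what is most likely to be asked about)
can be illustrated with a small sketch. It is written in Python under
stated assumptions and is not the CYRUS representation; EraRecord and
its method names are hypothetical, and the events are toy data.

```python
class EraRecord:
    """An era's complete event list plus redundant short lists."""

    def __init__(self, role_theme):
        self.role_theme = role_theme
        self.events = []   # complete record: (description, people, important)
        self.people = {}   # redundant index: person token -> relationship

    def add_event(self, description, people=(), important=False):
        self.events.append((description, tuple(people), important))

    def note_person(self, person, relationship):
        # Stored a second time so it can be found without scanning events.
        self.people[person] = relationship

    def events_involving(self, person):
        # Slow path: the full event list always holds the information.
        return [d for d, people, _ in self.events if person in people]

    def recall(self):
        # Important events come to mind first; the stable sort keeps the
        # original order within each group.
        ordered = sorted(self.events, key=lambda e: not e[2])
        return [d for d, _, _ in ordered]


era = EraRecord("RT-LAWYER")
era.add_event("routine meeting", ["HUM30"])
era.add_event("joined firm", ["HUM30"], important=True)
era.note_person("HUM30", "colleague")
```

A question about the relationship to HUM30 is answered from the short
list in one lookup, while the event list remains the complete record.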

ORGANIZATION AND REPRESENTATION OF EVENTS IN CYRUS

A person frame contains pointers to all of the information we
know about a person. In CYRUS, a person frame for a particular person
is represented using a token. A token is an atom representing an
object and has a list of the object's properties attached to it. A
person token has attributes of the person frame attached to it as well
as a marker specifying that it represents a person.

A simplified version of the token for Cyrus Vance appears in
figure 1. Anything with prefix "HUM" is a pointer to another person
token. Anything with prefix "LOC" is a pointer to a location token,
and anything with prefix "CON" is a conceptualization (an event, a
state, an instance of a script, era, or role theme, or a sequence of
eras).
HUM1:  CLASS       #PERSON
       FIRSTNAME   CYRUS
       MIDDLENAME  ROBERTS
       LASTNAME    VANCE
       GENDER      *MASC*
       OCCUPATION  CON27
       PROFESSION  CON26
       BIRTHDATE   TIM1
       BIRTHPLACE  LOC1
       FATHER      HUM11
       MOTHER      HUM12
       WIFE        HUM4
       KIDS        (HUM5 HUM6 HUM7 HUM8 HUM9)
       EYECOLOR    BROWN
       HAIRCOLOR   GREY
       LIFE        CON2


figure 1


Biographical events about a person are referenced through the
property LIFE on the person token. The property LIFE points to the
person's sequences of eras. Figure 2 shows part of Vance's
occupational sequence.


CON20:  CLASS      #ERA
        VALUE      ($$PROFSCHOOL ACTOR HUM1)
        DURATION   (ORDER (3 YEARS))
        ROLETHEME  CON21
        EVENTS     (CON22 CON23 ...)

CON21:  VALUE      (RT-LAWSCHOOLSTUDENT ACTOR HUM1 SCHOOL ORG1)

CON25:  CLASS      #ERA
        VALUE      ($$CAREER ACTOR HUM1)
        DURATION   (ORDER (10 YEARS))
        ROLETHEME  CON26
        EVENTS     (CON28 CON29 ...)

CON26:  VALUE      (RT-LAWYER ACTOR HUM1 FIRM ORG2)

CON30:  CLASS      #ERA
        VALUE      ($$POLITICAL-CAREER ACTOR HUM1)
        ROLETHEME  CON27
        EVENTS     (CON32 CON33 ...)

CON27:  VALUE      (RT-SEC-OF-STATE ACTOR HUM1 COUNTRY POL1)

CON32:  VALUE      ((<=> ($VIPVISIT ACTOR HUM1 DESTINATION POL5 ...)))


figure 2
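
To make the pointer structure of figures 1 and 2 concrete, the tokens
can be mimicked with property lists keyed by token name. This is a
rough Python sketch rather than the original representation. The
layout of CON2 is not shown in the figures, so the form used here (a
named, ordered list of era tokens) is an assumption, as are the names
MEMORY, occupational_role_themes, and current_occupation.

```python
# Property lists keyed by token name, transcribed in part from figures 1 and 2.
MEMORY = {
    "HUM1": {"CLASS": "#PERSON", "LASTNAME": "VANCE",
             "OCCUPATION": "CON27", "PROFESSION": "CON26", "LIFE": "CON2"},
    # Assumption: CON2 maps a sequence name to an ordered list of era tokens.
    "CON2": {"OCCUPATIONAL": ["CON20", "CON25", "CON30"]},
    "CON20": {"CLASS": "#ERA", "ROLETHEME": "CON21"},
    "CON25": {"CLASS": "#ERA", "ROLETHEME": "CON26"},
    "CON30": {"CLASS": "#ERA", "ROLETHEME": "CON27"},
    "CON21": {"VALUE": ("RT-LAWSCHOOLSTUDENT", "ACTOR", "HUM1")},
    "CON26": {"VALUE": ("RT-LAWYER", "ACTOR", "HUM1")},
    "CON27": {"VALUE": ("RT-SEC-OF-STATE", "ACTOR", "HUM1")},
}


def occupational_role_themes(person):
    """Follow LIFE -> era sequence -> ROLETHEME and return role-theme names."""
    life = MEMORY[MEMORY[person]["LIFE"]]
    return [MEMORY[MEMORY[era]["ROLETHEME"]]["VALUE"][0]
            for era in life["OCCUPATIONAL"]]


def current_occupation(person):
    """Redundant shortcut: OCCUPATION points straight at a role theme."""
    return MEMORY[MEMORY[person]["OCCUPATION"]]["VALUE"][0]


history = occupational_role_themes("HUM1")
now = current_occupation("HUM1")
```

The second function illustrates why the current occupation is stored
directly on the person token: it is reached in two lookups instead of
a walk through the whole life sequence.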


Events can also be related causally through causal connectives
and temporally through relational time links. Each event in memory
has attached to it one or more time specifications relating it
temporally to other events in memory. Most events do not have real
time specified, but have a fuzzy time or time relative to some other
event. Only the most important events, or some few events where real
time is known, have real time specified, such as birth and high-school
graduation. Some typical specifications would be "when I was 15 years
old" or "2 years after we moved to Oklahoma" (represented, of course,
in CD). If needed, real time, time intervals, and durations can be
quickly computed using these relational times.
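
Computing a real time from relational time links amounts to following
the links back to an event whose real time is known. The following
Python sketch shows the idea under simplifying assumptions: each event
has a single time specification, offsets are whole years, and fuzzy
times are not modeled. The function name resolve_year, the tuple
format, and the years in the toy data are all invented for
illustration.

```python
def resolve_year(event, times, _seen=None):
    """Compute an absolute year by following relative time links.

    `times` maps an event to either ("abs", year) or
    ("rel", anchor_event, offset_in_years). Returns None if unresolvable.
    """
    seen = _seen or set()
    if event in seen or event not in times:
        return None           # cycle, or no time specification at all
    seen.add(event)
    spec = times[event]
    if spec[0] == "abs":
        return spec[1]
    _, anchor, offset = spec
    base = resolve_year(anchor, times, seen)
    return None if base is None else base + offset


# Toy data for illustration (the years are invented).
times = {
    "birth": ("abs", 1950),
    "moved-to-oklahoma": ("rel", "birth", 13),      # "when I was 13"
    "first-job": ("rel", "moved-to-oklahoma", 2),   # "2 years after we moved"
}

year = resolve_year("first-job", times)
```

A duration between two events can then be taken as the difference of
their resolved years, when both resolve.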

Events are related causally through causal links pointing to
other states or events in memory. These are both forward and backward
links, and can cross era boundaries and sequences of eras. These
links are necessary in answering questions concerned with causality.
A sequence of events is related causally by each event in the sequence
pointing to the next and previous events in the sequence. Direct
causal relations such as enablement, reason, or general causality are
represented with both backward and forward causal links. Usually, in
a sequence of events, only the first or last event in the sequence is
important enough to be in the EVENTS lists attached to each era.
However, whenever one of those events is retrieved, the whole sequence
will be retrieved, and the question answerer will have the option of
answering with the whole sequence or any part of it, depending upon
the question type and what the user has already been told.
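
Bidirectional causal links, and the retrieval of a whole sequence from
any one of its events, can be sketched as a doubly linked chain. This
Python sketch is illustrative only; the names Event, link, and
whole_sequence are hypothetical, and it assumes a simple, unbranched,
acyclic chain, which real causal structure need not be.

```python
class Event:
    def __init__(self, name):
        self.name = name
        self.causes = []    # backward links: (link type, earlier event)
        self.results = []   # forward links: (link type, later event)


def link(earlier, later, kind="CAUSE"):
    """Record one causal relation in both directions."""
    earlier.results.append((kind, later))
    later.causes.append((kind, earlier))


def whole_sequence(event):
    """From any event, walk back to the start of its chain and return the
    whole chain in order (assumes an unbranched, acyclic chain)."""
    while event.causes:
        event = event.causes[0][1]
    chain = [event]
    while chain[-1].results:
        chain.append(chain[-1].results[0][1])
    return [e.name for e in chain]


# Toy sequence: only `talks` would need to appear in an era's EVENTS list.
invited, travel, talks = Event("invited"), Event("travel"), Event("talks")
link(invited, travel, "REASON")
link(travel, talks, "ENABLE")

sequence = whole_sequence(talks)
```

Starting from the one event stored in the era, the answerer recovers
the full chain and can then choose how much of it to report.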


ANSWERING QUESTIONS


Question answering is one of the primary functions of CYRUS.
Question answering involves more than simple retrieval of information.
Inference, knowledge of causality, knowledge of intent, and general
world knowledge are all part of the retrieval process. In addition,
the question answerer must have some understanding of what the
questioner already knows, and must know when a question makes sense
and is therefore legitimate (Norman (1972); Collins, et al. (1975)).


Questions are represented using CD, script, and role theme
representations, i.e., the same representations used in memory. In
order to understand questions, a question answerer must interface with
a conceptual analysis program which parses from English into CD. The
program CYRUS uses is based on one written by Chris Riesbeck (1975).
Similarly, in order to output answers, a question answerer must
interface with a generation program that translates from CD back into
English. CYRUS's generator is modeled after one designed by Neil
Goldman (1975). CYRUS itself works with Conceptual Dependency only,
and is designed to be language-independent. This means that if it
were interfaced with a parser or generator using a language other than
English, it could understand questions or generate answers in that
language. Inference, knowledge of causality and intent, scriptal and
era knowledge, and processing knowledge are all inferred from internal
representations, and therefore would remain the same regardless of
which natural language was used.
Once the natural-language question has been parsed into its CD
representation, question answering begins. The question must be
understood and the memory must be searched for an appropriate answer.
The question-answering process has four phases. When people ask
questions, they use pronouns and other references to previous
dialogue. Thus, in the first phase, these references must be resolved
using focus, context, and structures of previous questions and
answers. Also in this phase, tokens used in the question are
identified as permanent tokens the memory recognizes. In the second
phase, the category of the question is determined to discover the type
of information being asked for, such as enablement conditions,
motivation behind an event, when an event occurred, etc. Determining
the question category helps in interpreting the intent of the
question. For example, determining that "Who is Cyrus Vance?" is an
identification question tells us that the answer requires a meaningful
identifying characterization of Cyrus Vance. The first two phases can
be thought of as understanding the question. In the third phase,
memory is searched for the answer. In the fourth phase, the answer
retrieved from memory is further interpreted so that the generator can
give a good natural language answer. Finally, the generator generates
the answer in natural language. The phases are divided this way only
to make discussion of the question-answering process easier. There is
no claim that people process questions in phases such as these. Most
of the discussion that follows will deal with interpreting questions
for intent (i.e., understanding the question), searching memory for
the answer, and specifying which information belongs in the generated
response.
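
The four phases can be pictured as a small pipeline. The Python sketch
below is a deliberately simplified stand-in: questions are plain
dictionaries rather than CD structures, only two categories are
handled, and every function name, slot name, and table in it is a
hypothetical choice made for illustration, not something taken from
CYRUS.

```python
def resolve_references(question, context):
    """Phase 1: replace pronouns using the current focus of the dialogue."""
    return {slot: context["focus"] if filler in ("HE", "SHE", "THEY") else filler
            for slot, filler in question.items()}


def categorize(question):
    """Phase 2: pick a question category from the form of the question."""
    table = {"WHERE": "Place", "WHO": "Identification"}
    return table.get(question["wh"], "Verification")


def search_memory(question, category, memory):
    """Phase 3: look up the property of the actor that the category asks for."""
    slot = {"Place": "LOCATION", "Identification": "OCCUPATION"}.get(category)
    return memory.get(question["actor"], {}).get(slot)


def interpret(answer, category):
    """Phase 4: shape the raw answer for the generator."""
    if answer is None:
        return None
    return f"IN {answer}" if category == "Place" else answer


def answer_question(question, context, memory):
    resolved = resolve_references(question, context)
    category = categorize(resolved)
    return interpret(search_memory(resolved, category, memory), category)


# Toy memory and dialogue context for illustration.
memory = {"HUM1": {"LOCATION": "ISRAEL", "OCCUPATION": "SECRETARY OF STATE"}}
context = {"focus": "HUM1"}
reply = answer_question({"wh": "WHERE", "actor": "HE"}, context, memory)
```

As the text notes, the division into phases is a convenience for
exposition; the sketch makes no claim about how the phases interact in
the actual system.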




RETRIEVAL AND RESPONSE IN CYRUS

Scripts, role themes, and eras play an important role in CYRUS.
Besides being used for representation and organization of knowledge in
memory, they are also used to organize and guide question answering
and memory updating. Many problems of inference and general world
knowledge are handled through the processing defined by these
knowledge structures. They provide a way of organizing inferences and
providing expectations that guide the processing of knowledge. After
using inferences to determine that a question is about a particular
script or era, the question answerer can consult processing
information organized under that structure to find out where in memory
to look for an answer. Determining that a question is asking about a
particular script or era narrows the search by providing information
about both which era(s) to look in and where in the selected era(s) to
look. Scripts and eras thus provide ways of determining if a question
is legitimate, and help to infer the intent of a question.

The CYRUS system uses the same basic question categories and
similar heuristics for each category as the question answerers
(Lehnert, 1978) for the SAM (Cullingford, 1978) and PAM (Wilensky,
1978) systems, with appropriate changes to take care of the
organization and the knowledge structures of CYRUS. Some of these
question types are more applicable to CYRUS than others, and some
categories have been subdivided because of their specificity in this
system. The important categories for questions in CYRUS are:


    Identification
    Feature Specification
    Enablement
    Instrumental/Procedural
    Concept Completion
    Time
    Place
    Verification
    Duration
    Motivational
    Occurrence
    Result Orientation

Question categories are determined by examining the CD
representation of the question. After the category is determined, the
question concept or main point of the question is extracted and the
retrieval process begins. The question categories will each be
described briefly in explaining the retrieval process. For more
information about them see Lehnert (1978).

After references in the question have been resolved and the
question category has been found, the question can be answered.
Processing is different for each question category, but there are
similarities in processing based upon the memory organization and
interaction of the question answerer with the memory. Processing
relies on script structures, era structures, knowledge about people
and objects, and combinations of these things. Processing will be
explained by describing in general the information each question type
is looking for, and then showing specific examples of how the question
answerer interacts with the memory representations.


1. Identification questions

Identification questions ask for further specification of a
token, and most often look like "Who is X?" or "What is Y?" An
important part of answering these questions lies in inferring the
intent of the question. A person asking this type of question wants
to know the most relevant identifying feature of the person or object
being asked about. This could be different each time the question is
asked, depending upon the context. For instance, Mrs. Jones is
Mary's mother to Mary and her friends, but she is Johnnie's teacher to
Johnnie and his mother. In the domain of CYRUS, the relevant
identifying characteristic of a person is usually the person's
occupation if it is an identifying occupation, otherwise it will be
the person's relation to an important person, his function, or
whatever else will identify him. If a name is not mentioned in the
question, it is usually assumed that the questioner is asking for the
name of the person being described. CYRUS answers these questions by
identifying the token asked about in the question, and then extracting
the relevant information from its person frame. Therefore, in
answering questions such as "Who is Vance's wife?" or "Who is Vance's
father?", CYRUS answers with "Grace Vance" or "John Vance". In the
question "Who is Vance's father?", "Vance's father" is represented as
the male parent of Vance. Rules for identifying a parent follow:


1.  If  the  child  is  known,  then  get  the  appropriate  parent 
from  his  property  list. 

2.  If  the  child  is  not  known,  search  the  list  of  known 
people  for  the  person  with  the  property  of  being  the  parent 
of  the  designated  person. 


Thus, when asked the question "Who is Vance's father?", CYRUS will
look on the Cyrus Vance token, and find the name of his father. If it
had not been there, then CYRUS would have done a memory search to see
if any person listed had that property. The processing for that
question is as follows:

Who is Vance's father?

(QUESTION IS ((ACTOR (#PERSON KIDS (HUM1)
GENDER *MASC*) EQUIV (*?*))))

(QUESTION TYPE IS identification)

(SEARCHING TOKEN HUM1 FOR FATHER)

(ANSWER IS HUM11)

John Vance
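
The two parent-identification rules can be sketched in Python as follows. This is only an illustration: the dictionary layout, the `father`, `kids` and `gender` keys, and the function name `find_father` are assumptions made for this example, not CYRUS's actual representation.

```python
# Hypothetical token store: token id -> property list (a plain dict here).
TOKENS = {
    "HUM1": {"name": "Cyrus Vance", "father": "HUM11"},
    "HUM11": {"name": "John Vance", "gender": "MASC", "kids": ["HUM1"]},
}


def find_father(child_id, tokens):
    """Return the token id of the child's father, or None if unknown."""
    # First try to read the parent directly off the child's property list.
    child = tokens.get(child_id, {})
    if "father" in child:
        return child["father"]
    # Otherwise search all known people for a male parent of the child.
    for token_id, props in tokens.items():
        if props.get("gender") == "MASC" and child_id in props.get("kids", []):
            return token_id
    return None
```

Calling `find_father("HUM1", TOKENS)` follows the first path; removing the `father` entry from the child's token forces the memory search instead.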

When the name is specified, as in "Who is Cyrus Vance?", it is up
to the question answerer to pick out the best identifying feature of
the token, as explained above. In this case, CYRUS will answer
"Secretary of State of the United States," since his occupation is
Vance's best identifying feature. If it were asked "Who is John
Vance?", it would answer "Cyrus Vance's father," his relation to an
important person, and therefore the most meaningful way of identifying
him. A person is identified by determining his most distinguishing
property. A partial set of rules for answering Identification
questions follows.

If the name is mentioned in the question, then get the most
distinguishing feature of the person as follows:

1. If the person's occupation is distinguishing, then
identify him by his occupation.

2. If the person is related to an important person,
then identify him using that relation.

3. If the person has a function important to what is
being discussed, then identify him by his function.
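
The ordering of these three rules can be sketched as a simple cascade. The sketch below is hypothetical: the `occupation`, `relations` and `functions` keys and the `distinguishing` flag are stand-ins for whatever the person frame and the occupation script actually record.

```python
def identify(person, important_people, topic_functions=()):
    """Pick the most distinguishing feature of a named person, or None."""
    # Rule 1: a distinguishing occupation wins.
    occupation = person.get("occupation")
    if occupation and occupation.get("distinguishing"):
        return occupation["title"]
    # Rule 2: otherwise use a relation to an important person.
    for relation, other in person.get("relations", []):
        if other in important_people:
            return f"{other}'s {relation}"
    # Rule 3: otherwise use a function relevant to the current discussion.
    for function in person.get("functions", []):
        if function in topic_functions:
            return function
    return None
```

In CYRUS the test in rule 1 is answered by consulting the script for the occupation; here it is reduced to a stored flag for brevity.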

To answer some Identification questions, CYRUS must consult
information encoded in scripts or role themes. For instance, in order
to identify Cyrus Vance as Secretary of State, CYRUS had to consult
the script for Secretary of State to find out if it was a
distinguishing occupation. Some of the information it has about
Secretary of State is that it is an important political position
appointed by the President, and there is only one of them in a
country. This is enough information to tell CYRUS that it is indeed a
distinguishing occupation and thus should be used to answer the
question. In order to identify John Vance as Cyrus Vance's father,
CYRUS had to consult family information from the person frame to get
the list of family relations. Since CYRUS knows that father is a
family relation, it found who John Vance was father of, and thus
determined that this was his distinguishing feature. Some
intermediate output from CYRUS follows showing the flow of control in
answering those questions.

Who is Cyrus Vance?

(QUESTION IS ((ACTOR HUM1 EQUIV *?*)))

(QUESTION TYPE IS identification)

(SEARCHING TOKEN HUM1 FOR DISTINGUISHING FEATURE)

(SEARCHING SECRETARY OF STATE SCRIPT FOR DISTINGUISHING FEATURE)
(ANSWER IS ((ACTOR HUM1 EQUIV (#PERSON OCCUPATION CON27))))
Secretary of State of the United States

Who is John Vance?

(QUESTION IS ((ACTOR HUM11 EQUIV *?*)))

(QUESTION TYPE IS identification)

(SEARCHING TOKEN HUM11 FOR DISTINGUISHING FEATURE)

(SEARCHING FAMILY RELATIONS FOR IMPORTANT RELATIONS)

(ANSWER IS ((ACTOR HUM11 EQUIV (#PERSON GENDER *MASC* KIDS
(HUM1)))))

Cyrus Vance's father


2. Feature Specification questions


Feature Specification questions ask for a feature of a person or
object. They are answered by looking at the token for the person or
object in question, and then locating the appropriate feature. Some
of the same processes used for answering Identification questions are
used to identify the token being asked about. For instance, if the
question was "What color were Vance's father's eyes?", the procedures
for answering Identification questions would be used to locate the
token for Vance's father, where information about his eye color would
be stored (if it were known). Some features carry additional
information. For instance, in answering "How old is Vance?", we would
invoke the following rules:

1.  Get  the  AGE  feature  from  the  actor's  token. 

2.  If  there  is  no  AGE  feature,  find  out  if  the  person  is 
alive  by  seeing  if  his  DEAD  feature  is  marked.  If  so, 
answer,  "Actor  is  not  alive",  or,  if  the  time  of  the  actor's 
death  is  known,  then  answer  "Actor  died  at  time  specified." 

3.  Otherwise,  try  to  find  out  when  the  actor  was  born  as 
follows  and  subtract  it  from  the  present  date. 

a.  Find  his  birthdate  by  looking  at  the  BIRTHDATE 
feature  of  his  token  or  checking  his  BIRTH  script  which 
is  at  the  beginning  of  his  EARLYCHILDHOOD  era. 

b.  If  no  birthdate  is  found,  look  at  the  timeline.  If 
there  is  a date  for  high-school  graduation,  subtract  18 
from  that  to  get  his  approximate  birthdate.  Otherwise, 
if  there  is  a date  for  his  graduation  from  college, 
subtract  22  from  that  to  get  his  approximate  birthdate. 
Otherwise,  if  there  is  a date  for  him  entering  his 
CAREER  era,  then  approximate  his  birthdate  at  between 
18  and  25  years  before  that.  Otherwise,  if  he  has  a 
marriage  date,  then  approximate  his  birthdate  as 
between  21  and  28  years  before  that. 
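
The age rules above amount to a cascade of fallbacks, which the following Python sketch makes explicit. It is a simplification under stated assumptions: dates are reduced to whole years, the dictionary keys are invented for the example, and the final "I don't know" reply is an addition, since the rules do not say what happens when no landmark is found.

```python
def estimate_birth_year(token):
    """Return (earliest, latest) birth year, or None if nothing is known."""
    # Rule 3a: an explicit birthdate (feature or BIRTH script) is exact.
    for key in ("birthdate", "birth_script_time"):
        if key in token:
            return (token[key], token[key])
    # Rule 3b: fall back on timeline landmarks, in the order the rules give.
    landmarks = [
        ("high_school_graduation", 18, 18),
        ("college_graduation", 22, 22),
        ("career_start", 18, 25),
        ("marriage", 21, 28),
    ]
    for key, min_age, max_age in landmarks:
        if key in token:
            return (token[key] - max_age, token[key] - min_age)
    return None


def answer_age(token, this_year):
    """Apply rules 1-3 in order and return a short English answer."""
    if "age" in token:                                   # Rule 1
        return str(token["age"])
    if token.get("dead"):                                # Rule 2
        if "death_year" in token:
            return f"Actor died in {token['death_year']}"
        return "Actor is not alive"
    span = estimate_birth_year(token)                    # Rule 3
    if span is None:
        return "I don't know"                            # assumed default
    youngest, oldest = this_year - span[1], this_year - span[0]
    if youngest == oldest:
        return str(youngest)
    return f"between {youngest} and {oldest}"
```

Returning a range rather than a single year keeps the "between 18 and 25 years" style of approximation visible in the answer.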

This whole procedure is invoked when someone asks CYRUS for a person's
age. In answering "How old is Vance?", the answer is found by
calculating it from his birthdate, and intermediate printout from
CYRUS is as follows:

How old is Cyrus Vance?

(QUESTION IS ((ACTOR HUM1 IS (AGE VAL (*?*)))))

(QUESTION TYPE IS feature specification)

(PROPERTY AGE NOT FOUND, CALCULATING AGE)

(FOUND BIRTHDATE)

(ANSWER IS ((ACTOR HUM1 IS (AGE VAL (61)))))

Sixty-one.

How old is Vance's father?

(QUESTION IS ((ACTOR HUM11 IS (AGE VAL (*?*)))))
(QUESTION TYPE IS feature specification)

(PROPERTY AGE NOT FOUND, CALCULATING AGE)

(FOUND DEATH)

(ANSWER IS ((CON ((ACTOR HUM11 IS (DEAD VAL (T))) AND
((ACTOR HUM11 IS
(AGE VAL (APPROX VAL (85))))))))
He is not alive, but would be about 85.


3.  Enablement  and  Instrumental/Procedural  questions 

Enablement  questions  ask  for  enabling  conditions  for  events. 
Instrumental/Procedural  questions  ask  for  instrumentality  or  how 
something  was  accomplished.  Many  times  the  distinction  is  blurry. 
For  instance,  the  questions  "How  did  Vance  become  Secretary  of  State?" 
and "How did Vance become a lawyer?" would be categorized as
Instrumental/Procedural  questions,  but  could  also  be  interpreted  as 
Enablement  questions  without  changing  the  answer.  They  are  asking  the 
same  things  as  "How  was  Vance  able  to  become  Secretary  of  State?"  and 
"How  was  Vance  able  to  become  a lawyer?",  and  we  would  expect  the  same 
answers.  This  is  taken  care  of  in  CYRUS  by  the  script  and  role  theme 
structures.  The  representations  for  "How  did  Vance  become  a lawyer?" 
and  "How  was  Vance  able  to  become  a lawyer?"  both  make  reference  to 
the  lawyer  script.  In  answering  the  questions,  the  first  thing  we 
want  to  do  is  reference  the  lawyer  script  and  see  if  there  is  any 
special  processing  it  tells  us  to  do.  In  this  case,  the  lawyer  script 
tells  us  to  look  at  the  enablement  conditions  of  the  lawyer  role 
theme,  which  says  to  do  the  following: 


If the question type is Enablement or
Instrumental/Procedural and if the procedure is being asked
about a particular person, then find out which law school
that person went to by getting the name from his LAWSCHOOL
script, which is in the role theme of his PROFSCHOOL era,
and answer "He went to X law school." If there is no name
found, then answer "He went to law school." If there is no
LAWSCHOOL script or PROFSCHOOL era found, then check that
the person's PROFESSION or OCCUPATION is lawyer. If so,
then answer "He went to law school." If not, then answer
"He is not a lawyer." If the question is not about a
particular person, then answer "by going to law school."
Also, if he is a lawyer then find which state he passed the
bar exam in by looking at the beginning of his CAREER era or
the end of his PROFSCHOOL era for the event. If you find
it, add "he passed the bar in X state" to the initial
answer, else add "he passed the bar exam".*
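
As a rough illustration, the enablement rule of the lawyer role theme can be written as a short Python function. The nested `eras` dictionary and the flat `bar_exam_state` key are assumptions for this sketch; in CYRUS the bar exam event is found by searching the CAREER or PROFSCHOOL era, and the output is a CD structure rather than an English string.

```python
def how_became_lawyer(person=None):
    """Paraphrase the lawyer role theme's Enablement/Instrumental rule."""
    if person is None:
        # Generic question ("How does one become a lawyer?").
        return "by going to law school and passing the bar"
    school = person.get("eras", {}).get("PROFSCHOOL", {}).get("LAWSCHOOL")
    if school is not None:
        # LAWSCHOOL script found: name the school when it is recorded.
        name = school.get("name")
        answer = f"He went to {name}" if name else "He went to law school"
    elif person.get("occupation") == "lawyer":
        answer = "He went to law school"
    else:
        return "He is not a lawyer."
    # Bar exam clause: name the state when the event records one.
    state = person.get("bar_exam_state")
    bar = f"passed the {state} bar" if state else "passed the bar exam"
    return f"{answer} and {bar}."
```

With a token holding `{"name": "Yale Law School"}` under its LAWSCHOOL script and `"New York"` as the bar exam state, the function returns "He went to Yale Law School and passed the New York bar."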


Thus, the answer to "How did Vance become a lawyer?" would be "He went
to Yale Law School and passed the New York bar." Notice that if the
question had been "How does one become a lawyer?", then it could have
been answered by the same procedure and the answer would have been "by
going to law school and passing the bar." The same holds for "How did
Vance become Secretary of State?" and "How was Vance able to become
Secretary of State?" The structure of the script that holds
information about being Secretary of State is consulted for special
processing instructions.


How did Vance become Secretary of State?

(QUESTION IS ((<=> ($SEC-OF-STATE ACTOR HUM1) INST (*?*))))
(QUESTION TYPE IS instrumental/procedural)

(SEARCHING SEC-OF-STATE SCRIPT STRUCTURE FOR ANSWER)

(SEARCHING ENABLEMENT FOR SEC-OF-STATE ROLE THEME)

(ANSWER IS ((<=> ($APPOINT APPOINTER HUM15 APPOINTEE HUM1 POSITION
(RT-SEC-OF-STATE ACTOR HUM1 COUNTRY P0L1)))))

He was appointed by President Carter.


*Note that answers shown in the rules above and any rules that follow are
only a paraphrase of the internal CD representation. They are meant to get
across the content of the answer, and are not necessarily what would be
generated by the generator.

4. Concept Completion, Time and Place questions

Concept  Completion  questions  ask  for  the  missing  component  in  a 
question.  For  example,  "Who  went  to  the  Middle  East  with  Vance?"  asks 
for  the  group,  and  "Where  did  Vance  go?"  asks  for  the  place.  Time  and 
Place  questions  are  more  specific  categories  of  Concept  Completion 
questions,  requiring  additional  processing.  Some  of  these  questions 
have  the  added  complexity  of  also  dealing  with  scripts  or  other 
knowledge  structures.  For  example,  the  questions  "When  was  Vance 
born?"  and  "Where  was  Vance  born?"  are  Time  and  Place  questions 
respectively.  They  also  ask  for  information  from  the  BIRTH  script. 
Processing algorithms in the BIRTH script follow for Time and Place
questions:

If  the  question  type  is  TIME,  then 

1.  Get  the  BIRTHDATE  of  the  actor. 

2.  Otherwise,  look  at  his  BIRTH  script,  located  as  the 
first  event  in  his  EARLYCHILDHOOD  era.  Get  the  TIME 
off  his  BIRTH  script. 

3.  Otherwise,  answer  the  question  "How  old  is  the 
actor?"  by  following  the  algorithm  specified  above. 
Subtract  his  age  from  the  present  date  to  get  his  date 
of  birth. 

4.  Otherwise,  answer  "I  don't  know." 

If the question type is PLACE, then

1. Get the BIRTHPLACE of the actor.

2. Otherwise, look at his BIRTH script, located as
above. Get the PLACE off his BIRTH script.

3. Otherwise, answer the question "Where did actor
live in his EARLYCHILDHOOD era?". Answer "probably"
and that place.

4. Otherwise, answer "I don't know."

When was Cyrus Vance born?

(QUESTION IS ((<=> ($BIRTH ACTOR HUM1)) TIME (*?*)))
(QUESTION TYPE IS time)

(SEARCHING BIRTH SCRIPT FOR ANSWER)

(SEARCHING FOR BIRTHDATE OF ACTOR)

(ANSWER IS TIM1)

March 27, 1917

Where was Cyrus Vance born?

(QUESTION IS ((<=> ($BIRTH ACTOR HUM1)) PLACE (*?*)))
(QUESTION TYPE IS place)

(SEARCHING BIRTH SCRIPT FOR ANSWER)

(SEARCHING FOR BIRTHPLACE OF ACTOR)

(ANSWER IS LOC1)

Clarksburg, West Virginia

When was Grace Vance born?

(QUESTION IS ((<=> ($BIRTH ACTOR HUM1)) TIME (*?*)))
(QUESTION TYPE IS time)

(SEARCHING BIRTH SCRIPT FOR ANSWER)

(SEARCHING FOR BIRTHDATE OF ACTOR)

(CALCULATING AGE OF ACTOR)

(ANSWER IS TIM100)

Approximately 1920.
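
The BIRTH script rules for Time and Place questions share one shape: try the stored feature, then the script instance, then an approximation, and finally give up. The Python sketch below shows that shape. The keys are hypothetical, and the age-based fallback is reduced to a stored `AGE` value, whereas CYRUS would run the full age procedure described in section 2.

```python
def answer_birth_question(question_type, token, this_year=None):
    """Answer a TIME or PLACE question about a person's birth."""
    birth_script = token.get("BIRTH", {})
    if question_type == "TIME":
        if "BIRTHDATE" in token:                        # rule 1
            return token["BIRTHDATE"]
        if "TIME" in birth_script:                      # rule 2
            return birth_script["TIME"]
        if "AGE" in token and this_year is not None:    # rule 3
            return f"Approximately {this_year - token['AGE']}"
    elif question_type == "PLACE":
        if "BIRTHPLACE" in token:                       # rule 1
            return token["BIRTHPLACE"]
        if "PLACE" in birth_script:                     # rule 2
            return birth_script["PLACE"]
        home = token.get("EARLYCHILDHOOD_RESIDENCE")    # rule 3
        if home:
            return f"probably {home}"
    return "I don't know."                              # rule 4
```

An approximate age is what produces a hedged reply of the "Approximately 1920" kind shown in the last trace above.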


Other questions about time are not answered quite so easily. For
instance, "When was Vance working in Cyprus?" could be narrowed down
through inferences to his career era or his political career era, but
then all events in those eras would have to be searched for the
answer. When a matching event was found, the question answerer would
have to determine how to answer the question: with respect to world
events, with respect to other things going on in Vance's life at the
time, with respect to other events in Vance's political life, or with
real time. Often the time representation attached to the event solves
that problem. For instance, the time slot for Vance's Cyprus visit
includes the date and a pointer to the major world event that was
going on. If there is no specific date, however, the time response
will have to be determined by the inferred intent of the question. A
question about Vance's political career would be answered with respect
to the major world event going on at the time or in relation to other
political appointments he has had. A question about his family life
would be answered in terms of his career since that is what we (and
therefore CYRUS) focus on in thinking about Vance. Time questions
asked about people other than Vance would be answered in terms of the
part of that person's life that is considered most important or in
terms of the person the questions are focused on. Thus, a good answer
to the question "When did Vance's father die?" would be "When Vance
was five years old." "When did Vance get married?" would be answered
by "Soon after he became a lawyer." It is the intent of the question
that determines how it should be answered. In answering questions
about time in CYRUS, questions are usually answered with respect to
the occupational era sequence.

Questions about place are answered in a similar way to time
questions. As in the example above, if the match found to the event
in the question does not have a place attached, then the time atom and
the residence list on the person's token can be used to determine
place. Any script or era mentioned in the question is always
consulted first, however, before doing any general searches or
calculations. Answering questions about place requires knowledge of
intent and importance of activities. In answering the question "Where
did Vance go to law school?", the law-school student script is
consulted first. Processing rules from that script tell CYRUS that
the question is asking for an identification of the law school Vance
attended. It is therefore more appropriate to answer "Yale" than to
answer "in New Haven". In answering the question "Where was Vance
last week?", it must be determined that the best answer is the country
he was in last week. A number of correct answers could be given,
including "in an airplane", "in a hotel", "in Israel", and "in the
universe", but only the third one is the relevant answer. Place must
be determined in terms of an important activity Vance was doing, in
this case a visit to Israel. Similarly, in answering the question
"Where is Vance now?", the same type of processing must be done. If
there is no information in the data-base about where Vance is at a
particular time, then knowledge coming from the role theme he occupies
at that time is used to supply an answer. For example, CYRUS has
enough information about Secretaries of State to know that if he is
not on a trip, then the Secretary of State is probably in his nation's
capital, or at least in his own country. In Vance's case, this is the
United States, and CYRUS can therefore answer "In the United States".
Knowledge of Vance as somebody who travels a lot (since he is
Secretary of State) tells CYRUS that the best way to answer questions
asking for his whereabouts is usually to name the country he is in.

5.  Verification  questions 

Verification questions are yes/no questions. In answering these
questions, sometimes it is enough to search for a match in memory
after the search space has been narrowed. However, most of the time
that is not enough. More than a simple yes or no is needed to answer
these questions, so further interpretation of the question is needed
to determine what else is being asked for. For instance, if asked
"Does he have any children?", an appropriate reinterpretation is:
"Does he have any children, and if so, how many or what are their
names?" If asked, "Is he married?", an appropriate way to reinterpret
the question is: "Is he married, and if so, to whom?" In CYRUS, that
information comes from knowledge about family relationships which can
be found in the person frame. When a Verification question is asked
about a variable in a script or a slot in the person frame, it is
assumed that there should be some further specification about that
variable. When a Verification question is asked about the existence
of an instance of some script, further information should be given
about that script. Processing family relationships is as follows:

If the question type is Verification, and the unknown is
KIDS, and the initial answer is "yes", then get the list of
kids if it is known, or else the number of kids. Append their
names to the initial answer if they are known and there are
fewer than four; otherwise append their number.

If the question type is Verification, and the unknown part is
spouse, and the initial answer is "yes", then append the
answer to "Who did the actor marry?" to the initial answer if
it is known; otherwise append the answer to "When did the
actor marry?" to the initial answer.
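
A minimal Python sketch of these two family-relationship rules follows. The function names, the argument layout and the exact wording of the replies are illustrative assumptions; the point is only how a bare "yes" is extended with the extra information the questioner is presumed to want.

```python
def verify_children(names=None, count=None):
    """Expand a bare "yes" for "Does he have any children?"."""
    if names:
        count = len(names)
    if not count:
        return "No."
    # Names are appended only when known and there are fewer than four.
    if names and count < 4:
        return "Yes, " + ", ".join(names) + "."
    return f"Yes, {count}."


def verify_married(spouse=None, wedding_date=None, married=True):
    """Expand a bare "yes" for "Is he married?"."""
    if not married:
        return "No."
    if spouse:
        return f"Yes, to {spouse}."
    if wedding_date:
        return f"Yes, since {wedding_date}."
    return "Yes."
```

Both functions fall back to a plain "Yes." or a count when the more specific information is missing, mirroring the "if known ... else" structure of the rules.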

In  a question  such  as  "Has  he  ever  been  to  France?",  inferences 
off  the  primitive  act  PTRANS  help  to  answer  the  question. 
Reinterpreting  this  question  using  available  inferences,  CYRUS 
understands  the  question  as  "Has  he  ever  taken  a trip  to  France?"  and 
knows to look for instances of the TRIP script. Algorithms for
processing  in  the  TRIP  script  follow: 

1.  If  the  question  type  is  verification  and  the  initial 
answer  is  "yes",  then  append  the  answer  to  "When  was  the 
trip?"  to  the  initial  answer. 

2.  If  the  trip  was  part  of  a CAREER  era  or  MILITARY  SERVICE 
era,  then  append  the  answer  to  "What  was  the  purpose  of  the 
trip?"  to  the  answer.  Otherwise,  if  the  trip  was  before 
marriage,  then  append  the  answer  to  "Who  else  went  on  the 
trip?"  to  the  answer. 
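
The TRIP script rules can be sketched the same way. In this hypothetical Python version a trip instance is a dictionary with optional `year`, `era`, `purpose` and `companions` entries; those names, and the English phrasing of the reply, are assumptions for the example.

```python
def verify_trip(trip, marriage_year=None):
    """Build the expanded answer for "Has he ever been to X?"."""
    if trip is None:
        return "No."
    parts = ["Yes"]
    # Rule 1: say when the trip took place.
    if "year" in trip:
        parts.append(f"in {trip['year']}")
    # Rule 2: career and military trips get a purpose ...
    if trip.get("era") in ("CAREER", "MILITARY SERVICE"):
        if "purpose" in trip:
            parts.append(f"for {trip['purpose']}")
    # ... otherwise trips before marriage get the travelling companions.
    elif marriage_year is not None and trip.get("year", marriage_year) < marriage_year:
        if trip.get("companions"):
            parts.append("with " + " and ".join(trip["companions"]))
    return ", ".join(parts) + "."
```

For a career-era trip such as `{"year": 1977, "era": "CAREER", "purpose": "peace talks"}` the reply includes both the date and the purpose.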

If an answer is not found through checking instances of the TRIP
script, then by using other inferences, the question can be
reinterpreted as "Has he ever lived in France?", and the residence
list of the person can be checked. If France were on the residence
list, then CYRUS would include in the answer the information that the
trip was for the purpose of residing in France.


6.  Duration  questions 


Duration questions are those questions that ask for calculation
of time duration. The questions "How long has Vance been Secretary of
State?" and "How many years has it been since Vance became Secretary
of State?" are both answered by finding the time when he began being
Secretary of State, and subtracting it from the present date. "How
long was Vance in the Mid-East?" is answered by getting the start time
and the end time of his trip and subtracting. Notice that these are
both the same calculation. In the first question, the end time was
now. The rule for answering Duration questions is:

1. get the start time

2. get the end time

3. subtract the start time from the end time
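
The Duration rule is a single subtraction once both end points are known. A minimal Python sketch using the standard `datetime` module is given below; the function name and the `today` parameter are assumptions added so that open-ended questions can use "now" as the end time.

```python
from datetime import date


def duration_days(start, end=None, today=None):
    """Get the start time, get the end time, and subtract."""
    if end is None:
        # Open-ended questions ("How long has Vance been ...?") end "now".
        end = today or date.today()
    return (end - start).days
```

For example, `duration_days(date(1978, 1, 1), date(1978, 1, 7))` gives 6, which matches the form of the "Six days." answer in the trace later in this section (the dates here are invented).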


Many times, getting the start time involves a time calculation. For
example, to answer "How long has it been since Vance graduated law
school?", the question "When did Vance graduate law school?" (a Time
question) must be answered to get the start time. If the data-base
did not explicitly have that date, then the approximate time of
Vance's graduation would be calculated as explained in section (4)
above by making inferences from when he graduated college or when he
was born. Thus, processing associated with scripts and eras can be
used for calculating start and end times of events and eras. Some
scripts, such as the VIPVISIT script, have start and end times as
script variables. When that is the case, Time questions do not have
to be answered to calculate start and end times.

How  long  was  Vance  in  Israel? 

(QUESTION  IS  ((ACTOR  HUM1  IS  (*LOC*  VAL  P0L1)) 

TIME  ((BEFORE  *NOW*  X))  DURATION  (*?*))) 

(QUESTION  TYPE  IS  duration) 

(MATCHED  QSTAT  TO  $VIPVISIT) 

(SEARCHING  $VIPVISIT  FOR  ANSWER) 

(CALCULATING DURATION FROM ARRTIME AND DEPTIME)

(ANSWER IS ((*DAYS* 6)))

Six  days. 

7.  Occurrence,  Result  Orientation  and  Motivational  questions 

Occurrence  questions  ask  what  followed  after  an  event,  as  in 
"What  happened  when  Vance  went  to  the  Mid-East?"  These  questions  are 
answered  by  finding  the  event  mentioned  in  the  question  and  following 
the  causal  chain  from  that  event.  Any  forward  causal  chain 
connections  are  retrieved  in  answering  these  questions. 

Result  Orientation  questions  ask  what  resulted  from  a particular 
event,  as  in  "What  resulted  from  Vance's  Mid-East  visit?"  These 
questions  are  answered  by  finding  the  event  in  the  question  and 
following  LEADTO,  REASON,  INITIATE,  and  ENABLE  links  in  the  causal 
chain.  The  difference  between  processing  for  these  questions  and 
Occurrence  questions  is  that  Occurrence  questions  can  also  get  their 
answers  from  NEXT  links. 

Motivational questions ask for the motivation or reason behind
doing an act, as in "Why did Vance go to the Mid-East?" As in
answering Occurrence and Result Orientation questions, CYRUS finds a
match in the event list of the appropriate era, and then looks at the
causal chain coming from that event. In the case of Motivational
questions, CYRUS follows all COMESFROM links from the matched event to
see if any of the COMESFROM links point to events which could have
motivated the matched event. A COMESFROM link points backwards to
events that are causally connected by REASON, INITIATE, or LEADTO
connectives. An event that has a REASON or INITIATE connective
pointing to the matched event will answer a Motivational question. In
addition, some scripts usually have goals attached as script
variables. The VIPVISIT almost always has a goal attached and
VIPVISIT processing would retrieve it.
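
The three question types in this section differ only in which causal links they follow and in which direction. The Python sketch below illustrates that difference on an invented chain. Storing the chain as (source, link, target) triples is an assumption, and only one step is followed, whereas CYRUS may follow the chain further.

```python
RESULT_LINKS = {"LEADTO", "REASON", "INITIATE", "ENABLE"}
OCCURRENCE_LINKS = RESULT_LINKS | {"NEXT"}
MOTIVE_LINKS = {"REASON", "INITIATE"}

# Hypothetical causal chain: (source event, link type, target event).
CHAIN = [
    ("conflict", "REASON", "visit"),
    ("visit", "NEXT", "press conference"),
    ("visit", "LEADTO", "agreement"),
]


def follow(event, question_type, chain):
    """Return the events that answer a question about `event`."""
    if question_type == "motivational":
        # Look backwards (COMESFROM); only REASON/INITIATE links motivate.
        return [src for src, link, dst in chain
                if dst == event and link in MOTIVE_LINKS]
    # Occurrence questions may also use NEXT links; Result questions may not.
    links = OCCURRENCE_LINKS if question_type == "occurrence" else RESULT_LINKS
    return [dst for src, link, dst in chain if src == event and link in links]
```

On this chain an Occurrence question about the visit retrieves both the press conference and the agreement, while a Result Orientation question retrieves only the agreement.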


For any of the question types discussed above, if the answer
cannot be found by the methods specified, then further inferences can
be used. For example, the questions above are all instances of the
VIPVISIT script, and knowledge of the VIPVISIT script tells CYRUS how
to do further processing as follows:

If  the  question  type  is  Occurrence  or  Result  Orientation, 
and  an  answer  has  not  been  found  from  the  causal  chain,  then 
look  in  the  world  events  timeline  at  the  time  corresponding 
to  the  trip,  and  see  if  anything  significant  happened  in  the 
part  of  the  world  where  the  actor  was  at  that  time.  If  so, 
use  that  for  the  answer. 

If the question type is Motivational, and an answer has not
been found from the causal chain or attached to the script
instance, then look at the world events timeline at the time
corresponding to the trip, and see if there were any
problems going on in that part of the world at that time.
If so, use that as the answer.
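
The timeline fallback described by these two rules might look as follows in Python. Everything about the data layout is assumed for illustration: each timeline entry carries a `region`, a `year`, a description under `what`, and optional `significant` and `problem` flags, and the trip is reduced to a region and a year range.

```python
def timeline_fallback(question_type, trip, timeline):
    """Consult the world-events timeline when the causal chain gives nothing."""
    for event in timeline:
        same_place = event["region"] == trip["region"]
        same_time = trip["start"] <= event["year"] <= trip["end"]
        if not (same_place and same_time):
            continue
        if question_type == "motivational":
            # Motivational questions look for a problem in that region.
            if event.get("problem"):
                return event["what"]
        elif event.get("significant"):
            # Occurrence and Result Orientation look for significant events.
            return event["what"]
    return None
```

A return value of `None` means this fallback also failed and further inference is needed.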

Using the world events timeline to answer questions means that it must
be continuously updated along with the rest of the data-base in order
to answer questions correctly.


If all of these methods fail, Motivational questions can be
answered by looking at role theme information. If the event in
question is a typical one for the role theme the person is involved
in, then CYRUS can use the reasons associated with that act in the
role theme (if they exist). If there are no reasons associated, and
it is an act typically done by somebody in the role theme, then it can
use the default answer "because person x is in role theme y". For
example, if asked "Why did Vance go to England?", CYRUS would
determine that it was a VIPVISIT instance. If it could not find any
reason for the visit, then its knowledge of the political dignitary
role theme would tell CYRUS that political dignitaries normally go on
VIPVISITs in order to improve relations between countries or for good
will. Since CYRUS knows relations with England don't need improving,
it assumes he went on a goodwill mission, and answers accordingly.
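
The role theme fallback can be sketched as follows. This is a minimal
illustration under stated assumptions, not the CYRUS code: Reason,
RoleTheme, role_theme_answer and the "relations_need_improving" context
key are hypothetical names, and falling back to the default answer when
reasons exist but none applies is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Reason:
    """A reason attached to an act in a role theme (hypothetical structure)."""
    text: str
    applies: Callable[[dict], bool]   # assumed test against what is known


@dataclass
class RoleTheme:
    name: str
    typical_acts: Dict[str, List[Reason]]   # act -> reasons (list may be empty)


def role_theme_answer(person: str, act: str, theme: RoleTheme,
                      context: dict) -> Optional[str]:
    """Answer a Motivational question from role theme knowledge alone."""
    if act not in theme.typical_acts:
        return None                    # act is not typical for this role theme
    for reason in theme.typical_acts[act]:
        if reason.applies(context):
            return reason.text         # first associated reason that fits
    # No usable reason, but the act is typical: use the default answer.
    return "because " + person + " is in role theme " + theme.name


# Illustrative role theme mirroring the example in the text.
political_dignitary = RoleTheme(
    name="political dignitary",
    typical_acts={
        "VIPVISIT": [
            Reason("to improve relations between countries",
                   lambda ctx: ctx.get("relations_need_improving", False)),
            Reason("for good will", lambda ctx: True),
        ],
    },
)

if __name__ == "__main__":
    print(role_theme_answer("Vance", "VIPVISIT", political_dignitary,
                            {"relations_need_improving": False}))
```

Ordering the reasons and testing each against what is known is one
simple way to capture the step in which CYRUS rules out "improving
relations" for England and settles on good will.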

CONCLUSIONS


CYRUS is a data-base system designed for natural language
inquiry. Natural language inquiry requires a conceptual data-base,
since it is the intent of the questions, not necessarily the
individual words used in asking the questions, that is important.
Question answering using conceptual information involves much more
than the mere retrieval of information. Knowledge must be stored in a
meaningful way so as to help with the retrieval process.

CYRUS makes use of a theory of human memory organization in order
to store and retrieve information from its data-base. Memory is
organized through knowledge structures called scripts, role themes,
eras, and person frames. The question answering process consists of
interpreting the question by finding the correct question category,
and processing the question using inferences organized in these
knowledge structures. This model of memory organization and retrieval
has proven to be sufficient for dealing with knowledge of events and
simple causations in the domain chosen. It can be easily extended to
include other domains by adding new script and role theme structures,
and new processing rules to the memory, without changing anything else
in the system.

CYRUS is small, and there has been no need to refine low-level
indexing schemes, such as organization of events within the eras. In
creating a larger system, minimizing search within eras will become
important. When FRUMP begins updating CYRUS from the UPI wire, there
will be a large number of events put into Vance's current political
career era. At that time, problems of indexing within eras, updating
and debugging false or partially true information, deciding when to
forget low-level details of stories, and representing and handling
more complex time and time relations will need to be addressed in more
detail. These problems, and others, are currently being worked on,
and their solutions will be included in a complete CYRUS system,
including both question answering and updating modules, which should
be running within the next year.